Improving Decision Trees by Clustering

نویسنده

  • Venkat Chalasani
چکیده

Multi-modal classification problems arise in many fields and form an important class of problems. The presence of disjoint areas for each class creates special problems for techniques that cannot partition each class into more than one region. Among the various techniques that have been applied with some success to multi-modal problems are decision tree classifiers (DTCs) and back propagation neural networks. DTCs use a recursive partitioning algorithm to partition the feature space into disjoint areas. In principle, DTCs should perform well on multi-modal data sets because they have the ability to partition each class into disjoint areas. In practice, however, DTCs can have significant difficulty with such data sets. In the presence of multimodality decision trees can became very bushy leading to high error rates. We show that clustering of the data before classification can lead to simpler trees and lower error rates.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Improving Accuracy of Decision Trees Using Clustering Techniques

Data mining is an important part of information management technology. Simply put, it is a method to extract and analyze meaningful patterns and correlations in a large relational database. In Data mining, Decision trees are one of the most worldwide used tools for decision support. In the emerging area of Data mining applications, users of data mining tools are faced with the problem of data s...

متن کامل

Improving Accuracy in Intrusion Detection Systems Using Classifier Ensemble and Clustering

Recently by developing the technology, the number of network-based servicesis increasing, and sensitive information of users is shared through the Internet.Accordingly, large-scale malicious attacks on computer networks could causesevere disruption to network services so cybersecurity turns to a major concern fornetworks. An intrusion detection system (IDS) could be cons...

متن کامل

Improving Accuracy using different Data Mining Algorithms

Mining large data set is an important issue to deal with as data is growing as the field grows. Today, crime rate is a menace that each country faces. With the increase in crime rate the data is increasing and it is such a critical field that accuracy is important at the same time. This paper shows the comparison in the results between clustering and the classification. K means is used in clust...

متن کامل

Tree approximation for discrete time stochastic processes: a process distance approach

Approximating stochastic processes by scenario trees is important in decision analysis. In this paper we focus on improving the approximation quality of trees by smaller, tractable trees. In particular we propose and analyze an iterative algorithm to construct improved approximations: given a stochastic process in discrete time and starting with an arbitrary, approximating tree, the algorithm i...

متن کامل

Data Mining Approach to the Sunspot Classification Problem

Sunspots are the subject of interest to many astronomers and solar physicists. Sunspot observation, analysis and classification form an important part of furthering the knowledge about the Sun. Sunspot classification is a manual and thus very labor intensive process that could be automated if it can be successfully learned by a machine. This paper deals with data mining approaches to the proble...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2004